
Add an example to the readme? #2067

Merged: 7 commits from mcabbott-patch-2 into master, Oct 8, 2022
Conversation

@mcabbott (Member) commented on Sep 26, 2022:

This proposes adding a very simple Flux model to the readme: almost the simplest thing I could think of that actually works.

It's nice to see code right away. I guess it won't make sense if you've never seen a neural network, but if you've ever tried a demo elsewhere, this should look familiar and point out the main features: (I hope) that batch dimensions are last, that the model is a mutable object, how to move it to the GPU, and what arguments train! needs (and in what order).

Without BatchNorm this works about 90% of the time; with default Adam() too, just 80%. So the post-1969 details are doing something.

On Julia 1.7+, just pasting this in will install everything, including CUDA. Maybe it should say so; I didn't want to put too many words. (Edit: now it does.)
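
For concreteness, here is a sketch of roughly the kind of example under discussion. The data-generation lines (noisy XOR-style inputs, the DataLoader settings, the accuracy check) are illustrative guesses, not necessarily the exact code this PR adds:

using Flux, Statistics  # on Julia 1.7+, `using Flux` can install everything, including CUDA

# Hypothetical toy data: noisy 2D points, labelled by XOR of their coordinates.
# Batch dimension last: a 2×1000 matrix holds 1000 samples.
noisy = rand(Float32, 2, 1000)
truth = [xor(col[1] > 0.5, col[2] > 0.5) for col in eachcol(noisy)]
target = Flux.onehotbatch(truth, [true, false])      # 2×1000 one-hot matrix
data = Flux.DataLoader((noisy, target) |> gpu, batchsize=64, shuffle=true)

model = Chain(Dense(2 => 3, sigmoid), BatchNorm(3), Dense(3 => 2)) |> gpu
optim = Adam(0.1, (0.7, 0.95))
mloss(x, y) = Flux.logitcrossentropy(model(x), y)    # closes over model

for epoch in 1:100                                   # each train! call is one pass over data
    Flux.train!(mloss, Flux.params(model), data, optim)
end

out = model(noisy |> gpu) |> cpu                     # 2×1000 matrix of logits
mean((softmax(out) .> 0.5) .== target)               # accuracy; usually close to 1.0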


README.md Outdated

[![](https://img.shields.io/badge/docs-stable-blue.svg)](https://fluxml.github.io/Flux.jl/stable/) [![](https://img.shields.io/badge/chat-on%20slack-yellow.svg)](https://julialang.org/slack/) [![ColPrac: Contributor's Guide on Collaborative Practices for Community Packages](https://img.shields.io/badge/ColPrac-Contributor's%20Guide-blueviolet)](https://github.com/SciML/ColPrac) [![DOI](https://joss.theoj.org/papers/10.21105/joss.00602/status.svg)](https://doi.org/10.21105/joss.00602)

[![][action-img]][action-url] [![][codecov-img]][codecov-url]
@mcabbott (Member, Author) commented on Sep 26, 2022:

Moved because these badges wrapped onto two lines, and seemed to be in random order. This puts the CI ones on the second line and centres them below the logo.

A Member commented:

I think FluxML projects should also include the "number of downloads" badge in their READMEs -

Flux Downloads
Zygote Downloads
NNlib Downloads
Functors Downloads
Metalhead Downloads
FastAI Downloads
... and so on

@mcabbott (Member, Author) replied:

Could do, 88k sounds like a lot...

README.md Outdated (resolved)

README.md Outdated
Comment on lines 7 to 9
[![](https://img.shields.io/badge/docs-stable-blue.svg)](https://fluxml.github.io/Flux.jl/stable/) [![](https://img.shields.io/badge/chat-on%20slack-yellow.svg)](https://julialang.org/slack/) [![ColPrac: Contributor's Guide on Collaborative Practices for Community Packages](https://img.shields.io/badge/ColPrac-Contributor's%20Guide-blueviolet)](https://github.com/SciML/ColPrac) [![DOI](https://joss.theoj.org/papers/10.21105/joss.00602/status.svg)](https://doi.org/10.21105/joss.00602)
<br/>
[![][action-img]][action-url] [![][codecov-img]][codecov-url]
@Saransh-cpp (Member) commented on Oct 2, 2022:

Also, it might be better to arrange these badges in a table? Something like this:

| Type            | Badge/Status                                                                              |
|-----------------|-------------------------------------------------------------------------------------------|
| CI              | pkgeval                                                                                    |
| Documentation   | pages-build-deployment                                                                     |
| Community       | Flux Downloads, ColPrac: Contributor's Guide on Collaborative Practices for Community Packages, deps |
| DOI and version | DOI, version                                                                               |

But this would take more vertical space

@mcabbott (Member, Author) replied:

This is more logically organised. But I'm not sure these badges are really so valuable. Knowing to click the blue button is a quick way to get to the docs on an unknown package, but perhaps we should trim some of the others.

@CarloLucibello (Member) commented:

The example feels too smart, but on the other hand it could be catchier than a simpler MNIST example.

@mcabbott (Member, Author) commented on Oct 7, 2022:

I wondered about MNIST, but I think it'll end up a fair bit longer... with more of the code dealing with loading data etc., rather than core Flux + basic Julia stuff?

Could do linear regression pretty compactly (see the sketch below), but that seems less illustrative.
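
For comparison, a sketch of how compact that linear-regression version might be (illustrative only, not code from this PR; the data and hyper-parameters are made up):

using Flux

x = rand(Float32, 1, 100)                        # 100 samples, batch dimension last
y = 2 .* x .+ 1 .+ 0.1f0 .* randn(Float32, 1, 100)
model = Dense(1 => 1)                            # a single affine map
loss(x, y) = Flux.mse(model(x), y)               # again closes over model
Flux.train!(loss, Flux.params(model), Iterators.repeated((x, y), 200), Descent(0.1))
model.weight, model.bias                         # should approach ([2.0], [1.0])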

I do think it should be short. Having a long, nicely explained example means it starts competing with the docs as an alternative way to learn, and we already have too many paths. Just a minimal thing you can copy-paste on day 1 & see that everything is installed and working.

This is a slightly weird feature of our API... and since we also call logitcrossentropy a loss function, perhaps we should emphasize that loss(x, y) is a new thing defined just for this model, not a function for all time.
Comment on lines +28 to +32
model = Chain(Dense(2 => 3, sigmoid), BatchNorm(3), Dense(3 => 2)) |> gpu
optim = Adam(0.1, (0.7, 0.95))
mloss(x, y) = Flux.logitcrossentropy(model(x), y) # closes over model

Flux.train!(mloss, Flux.params(model), data, optim) # updates model & optim
@mcabbott (Member, Author) commented on Oct 7, 2022:

BTW, this business of defining a loss which closes over the model (unlike what Flux.Losses calls a loss) was pretty confusing to me when I first tried Flux.

train! needs two different objects with implicit references to the model (and optim ends up with them), but does not seem to see the model itself? That's pretty weird.

IMO we should define crossentropy(m, x, y) = crossentropy(m(x), y) globally, and make Flux 0.14 do this, which is much less confusing:

model = Chain(Dense(2 => 3, sigmoid), BatchNorm(3), Dense(3 => 2)) |> gpu
optim = Flux.setup(Adam(0.1, (0.7, 0.95)), model)  # stores Adam's momenta

Flux.train!(Flux.logitcrossentropy, model, data, optim)  # updates model & optim
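
For clarity, the proposal amounts to giving each loss a model-first method, so train! can be handed the loss and the model separately. A sketch, assuming the model, data and optim from the snippet above (the helper name is made up; the explicit-style train! that shipped in Flux 0.14 does call the loss as loss(model, x, y)):

# Hypothetical model-first loss, as proposed above:
three_arg_loss(m, x, y) = Flux.logitcrossentropy(m(x), y)

# Then, with optim = Flux.setup(...) as above:
Flux.train!(three_arg_loss, model, data, optim)   # calls three_arg_loss(model, x, y)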

A Member replied:

This seems like the way forward to move train! away from Params.

@mcabbott merged commit dfc5a7e into master on Oct 8, 2022.
@mcabbott deleted the mcabbott-patch-2 branch on October 8, 2022 at 11:36.